On the use of a weighted autocorrelation based fundamental frequency estimation for a multidimensional speech input

Authors

  • Federico Flego
  • Luca Armani
  • Maurizio Omologo
Abstract

Accurate estimation of the fundamental frequency F0 is a well-known and still only partially solved problem, especially for noisy speech input. This work addresses a distant-talking scenario in which a distributed microphone network provides multi-channel input sequences to be processed for speaker modeling purposes. In this context, one may process each channel independently and then apply a majority vote or another fusion method. Alternatively, the redundancy across channels can be exploited by processing the different signals jointly to obtain a more reliable and robust F0 estimate. The paper investigates a multi-channel version of a Weighted Autocorrelation (WAUTOC)-based F0 estimation technique. A post-processing corrective step is introduced to improve the resulting F0 accuracy. Experiments conducted on a real database show the advantages and robustness of the proposed method in extracting the fundamental frequency regardless of microphone position, talker position, and head orientation.
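
The abstract does not state the WAUTOC formula or how the channels are combined. As a rough illustration only, the sketch below assumes a common single-channel definition of weighted autocorrelation (the autocorrelation divided by the average magnitude difference function, AMDF, plus a small constant `k`) and, as a further assumption, fuses channels by summing the per-channel functions before picking the peak. The function names, the value of `k`, and the F0 search range are hypothetical and are not taken from the paper.

```python
import numpy as np

def wautoc(frame, max_lag, k=1.0):
    """Autocorrelation divided by (AMDF + k) for lags 0..max_lag.

    Assumed definition; the constant k avoids division by zero
    where the AMDF vanishes.
    """
    n = len(frame) - max_lag  # same number of terms for every lag
    if n <= 0:
        raise ValueError("frame must be longer than max_lag")
    head = frame[:n]
    scores = np.empty(max_lag + 1)
    for lag in range(max_lag + 1):
        shifted = frame[lag:lag + n]
        acf = np.dot(head, shifted) / n          # autocorrelation at this lag
        amdf = np.mean(np.abs(head - shifted))   # average magnitude difference
        scores[lag] = acf / (amdf + k)
    return scores

def estimate_f0(channels, fs, f0_min=60.0, f0_max=400.0, k=1.0):
    """Estimate F0 (Hz) for one frame given as (n_channels, n_samples).

    Assumption: channels are fused by summing their weighted
    autocorrelation functions; the paper's fusion rule may differ.
    """
    channels = np.atleast_2d(np.asarray(channels, dtype=float))
    min_lag = int(fs / f0_max)
    max_lag = int(fs / f0_min)
    total = np.zeros(max_lag + 1)
    for ch in channels:
        ch = ch - ch.mean()                # remove DC offset
        ch = ch / (np.std(ch) + 1e-12)     # equalize channel levels
        total += wautoc(ch, max_lag, k)
    best_lag = min_lag + int(np.argmax(total[min_lag:max_lag + 1]))
    return fs / best_lag
```

Because the autocorrelation does not depend on a channel's absolute delay, the per-channel functions peak at the same lag even when the microphones sit at different distances from the talker, which is what makes a simple sum plausible here. With a wide search range, a peak-picking rule like this one can lock onto a multiple of the true period, so some form of octave-error correction is usually needed in practice.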

Similar resources

F0 estimation based on robust ELS complex speech analysis

A robust fundamental frequency (F0) estimation algorithm based on robust ELS (Extended Least Square) complex-valued speech analysis for an analytic speech signal is proposed in this paper. The speech spectrum can be accurately estimated at low frequencies since the analytic signal provides a spectrum only over positive frequencies. This remarkable feature makes it possible to realize more accurate F0 e...

Fast fundamental frequency determination via adaptive autocorrelation

We present an algorithm for the estimation of fundamental frequencies in voiced audio signals. The method is based on an autocorrelation of a signal with a segment of the same signal. During operation, frequency estimates are calculated and the segment is updated whenever a period of the signal is detected. The fast estimation of fundamental frequencies with low error rate and simple implementa...

Improving the performance of MFCC for Persian robust speech recognition

Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...

Pitch estimation in noisy speech based on temporal accumulation of spectrum peaks

In this paper, we present a study on robust pitch estimation by integrating spectral and temporal information in speech. Spectrum harmonics are important representations of the speech fundamental frequency. Harmonic-related spectral peaks of speech evolve much more slowly than the spectral peaks of noise. This motivates the proposition of temporally accumulated peak spectrum (TAPS), which is co...

Applications of Surface Correlation to the Estimation of the Harmonic Fundamental of Speech

We present a method for estimating the fundamental frequency of harmonic signals, and apply this method to human speech. The method is based on cross-spectral methods, which provide accurate resolution of multicomponent FM signals in both time and frequency. The fundamental is re-introduced to the spectrum by a frequency-lag autocorrelation of the spectrum, even if the fundamental is completely...

Publication date: 2004